348 research outputs found

    Distributionally Robust Semi-Supervised Learning for People-Centric Sensing

    Semi-supervised learning is crucial for alleviating labelling burdens in people-centric sensing. However, human-generated data inherently suffer from distribution shift in semi-supervised learning due to the diverse biological conditions and behavior patterns of humans. To address this problem, we propose a generic distributionally robust model for semi-supervised learning on distributionally shifted data. Considering both the discrepancy and the consistency between the labeled data and the unlabeled data, we learn latent features that reduce person-specific discrepancy and preserve task-specific consistency. We evaluate our model on a variety of people-centric recognition tasks on real-world datasets, including intention recognition, activity recognition, muscular movement recognition and gesture recognition. The experimental results demonstrate that the proposed model outperforms the state-of-the-art methods. Comment: 8 pages, accepted by AAAI201
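
    To make the idea concrete, here is a minimal sketch of such a training objective. The abstract does not state the exact losses, so this assumes an MMD-style penalty for the person-specific discrepancy and a perturbation-consistency term for the task-specific consistency; the names (RobustSSL, rbf_mmd), layer sizes, and weights are all hypothetical.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def rbf_mmd(x, y, sigma=1.0):
        # Squared maximum mean discrepancy with an RBF kernel between two feature batches.
        def k(a, b):
            return torch.exp(-torch.cdist(a, b).pow(2) / (2 * sigma ** 2))
        return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

    class RobustSSL(nn.Module):
        def __init__(self, in_dim, latent_dim, n_classes):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                         nn.Linear(128, latent_dim))
            self.classifier = nn.Linear(latent_dim, n_classes)

        def loss(self, x_lab, y_lab, x_unlab, lam_d=1.0, lam_c=1.0):
            z_lab, z_unlab = self.encoder(x_lab), self.encoder(x_unlab)
            # Supervised task loss on the labeled subset.
            task = F.cross_entropy(self.classifier(z_lab), y_lab)
            # Reduce person-specific discrepancy between labeled and unlabeled latents.
            discrepancy = rbf_mmd(z_lab, z_unlab)
            # Preserve task-specific consistency: predictions should stay stable
            # under small perturbations of the unlabeled inputs.
            z_noisy = self.encoder(x_unlab + 0.1 * torch.randn_like(x_unlab))
            consistency = F.mse_loss(self.classifier(z_unlab).softmax(-1),
                                     self.classifier(z_noisy).softmax(-1))
            return task + lam_d * discrepancy + lam_c * consistency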

    Adversarial Variational Embedding for Robust Semi-supervised Learning

    Semi-supervised learning is sought to leverage unlabelled data when labelled data are difficult or expensive to acquire. Deep generative models (e.g., the Variational Autoencoder (VAE)) and semi-supervised Generative Adversarial Networks (GANs) have recently shown promising performance in semi-supervised classification owing to their excellent discriminative representation ability. However, the latent code learned by a traditional VAE is not exclusive (repeatable) for a specific input sample, which limits its classification performance. In particular, the learned latent representation depends on a non-exclusive component that is stochastically sampled from the prior distribution. Moreover, semi-supervised GAN models generate data from a pre-defined distribution (e.g., Gaussian noise) that is independent of the input data distribution, which may obstruct convergence and makes the distribution of the generated data difficult to control. To address these issues, we propose a novel Adversarial Variational Embedding (AVAE) framework for robust and effective semi-supervised learning that leverages both the advantage of the GAN as a high-quality generative model and that of the VAE as a posterior distribution learner. The proposed approach first produces an exclusive latent code with a model we call VAE++ and, meanwhile, provides a meaningful prior distribution for the generator of the GAN. The proposed approach is evaluated on four different real-world applications, and we show that our method outperforms the state-of-the-art models, confirming that the combination of VAE++ and GAN provides significant improvements in semi-supervised classification. Comment: 9 pages, Accepted by Research Track in KDD 201
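
    The two stated ideas, an exclusive (deterministic) latent code per sample and a data-dependent prior for the GAN generator, can be illustrated with the rough sketch below. This is not the published VAE++ architecture; the module names, layer sizes, and helper generator_inputs are assumptions made for illustration only.

    import torch
    import torch.nn as nn

    class Encoder(nn.Module):
        # VAE-style encoder; the posterior mean serves as the exclusive latent code.
        def __init__(self, in_dim, z_dim):
            super().__init__()
            self.body = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
            self.mu = nn.Linear(256, z_dim)
            self.logvar = nn.Linear(256, z_dim)

        def forward(self, x):
            h = self.body(x)
            return self.mu(h), self.logvar(h)

    class Generator(nn.Module):
        def __init__(self, z_dim, out_dim):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                     nn.Linear(256, out_dim))

        def forward(self, z):
            return self.net(z)

    def generator_inputs(encoder, x):
        # Feed the GAN generator from the learned posterior instead of a fixed N(0, I) prior.
        mu, logvar = encoder(x)
        z_exclusive = mu                                             # deterministic, repeatable code
        z_sample = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterized draw
        return z_exclusive, z_sample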

    Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation

    Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI), especially when compared with the remarkable progress made in fine-tuning Large Language Models (LLMs). While cutting-edge diffusion models such as Stable Diffusion (SD) and SDXL rely on supervised fine-tuning, their performance inevitably plateaus after seeing a certain volume of data. Recently, reinforcement learning (RL) has been employed to fine-tune diffusion models with human preference data, but it requires at least two images (a "winner" and a "loser") for each text prompt. In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion), where the diffusion model engages in competition with its earlier versions, facilitating an iterative self-improvement process. Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment. Our experiments on the Pick-a-Pic dataset reveal that SPIN-Diffusion outperforms the existing supervised fine-tuning method in terms of human preference alignment and visual appeal right from its first iteration. By the second iteration, it exceeds the performance of RLHF-based methods across all metrics, achieving these results with less data. Comment: 28 pages, 8 figures, 10 table
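
    A hedged sketch of the self-play idea follows: the current model is trained to fit real images better than its frozen previous iteration while fitting that iteration's own generations worse. The per-sample denoising error is used here as a likelihood surrogate; the logistic loss form, the beta parameter, and the add_noise helper are assumptions, not the paper's exact objective.

    import torch
    import torch.nn.functional as F

    def denoise_error(model, x0, t, noise, cond, add_noise):
        # Per-sample epsilon-prediction error at noise level t; add_noise is an assumed
        # forward-diffusion helper supplied by the training pipeline, not a library API.
        x_t = add_noise(x0, t, noise)
        pred = model(x_t, t, cond)
        return F.mse_loss(pred, noise, reduction="none").flatten(1).mean(dim=1)

    def spin_diffusion_loss(model, prev_model, x_real, x_prev_gen, t, cond, add_noise, beta=1.0):
        noise = torch.randn_like(x_real)
        with torch.no_grad():                   # the earlier iterate acts as a frozen opponent
            prev_real = denoise_error(prev_model, x_real, t, noise, cond, add_noise)
            prev_fake = denoise_error(prev_model, x_prev_gen, t, noise, cond, add_noise)
        cur_real = denoise_error(model, x_real, t, noise, cond, add_noise)
        cur_fake = denoise_error(model, x_prev_gen, t, noise, cond, add_noise)
        # Preference-style logistic loss: lower error on real data than the previous
        # model, higher error on the previous model's own generations.
        margin = (prev_real - cur_real) + (cur_fake - prev_fake)
        return -F.logsigmoid(beta * margin).mean()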

    Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data

    Diffusion models achieve state-of-the-art performance in various generation tasks. However, their theoretical foundations lag far behind. This paper studies score approximation, estimation, and distribution recovery of diffusion models when data are supported on an unknown low-dimensional linear subspace. Our result provides sample complexity bounds for distribution estimation using diffusion models. We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated. Furthermore, the generated distribution based on the estimated score function captures the data's geometric structure and converges to a close vicinity of the data distribution. The convergence rate depends on the subspace dimension, indicating that diffusion models can circumvent the curse of data ambient dimensionality. Comment: 52 pages, 4 figure
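
    The setting can be illustrated with a toy case: data on an unknown d-dimensional linear subspace of R^D that are then Gaussian-perturbed. The standard-Gaussian latent below is an assumption made only to obtain a closed-form score; it shows how the score splits into a damped on-subspace component and a simple off-subspace component, the geometric structure the abstract refers to.

    import torch

    D, d, sigma = 64, 4, 0.5
    A, _ = torch.linalg.qr(torch.randn(D, d))      # orthonormal basis of the (unknown) subspace
    x0 = torch.randn(1000, d) @ A.T                # clean data supported on the subspace
    xt = x0 + sigma * torch.randn_like(x0)         # Gaussian-perturbed (diffused) data

    P = A @ A.T                                    # projector onto the subspace
    # Closed-form score of N(0, A A^T + sigma^2 I): the on-subspace component is damped
    # by 1/(1 + sigma^2), while the off-subspace component pushes back at rate 1/sigma^2,
    # so the hard part of score learning lives only in the d-dimensional subspace.
    score = -(xt @ P) / (1 + sigma ** 2) - (xt @ (torch.eye(D) - P)) / sigma ** 2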